Patent abstract:
Method and Device for Encoding and Decoding Video. The embodiments of the present invention provide a method and device for encoding and decoding video, relating to the field of communications, in which an efficient transformation matrix corresponding to the characteristics of each residual block is selected for the transformation, which improves coding efficiency. The solution provided in one embodiment of the present invention is: generating a prediction residue according to input video data; selecting a set of best transformation matrices from multiple candidate transformation matrices according to an intraframe prediction mode and rate-distortion criteria to perform transform coding on the prediction residue and obtain a transformation result; and generating an encoded stream according to the transformation result and the selected transformation matrix index information.
Publication number: BR112012011325B1
Application number: R112012011325-9
Filing date: 2010-08-30
Publication date: 2019-04-30
Inventors: Mingyuan Yang; Dong Wang; Lianhuan Xiong; Xin Zhao; Li Zhang; Siwei Ma; Wen Gao
Applicant: Huawei Technologies Co., Ltd.
Main IPC class:
Patent description:

Invention Patent Descriptive Report for METHOD FOR ENCODING VIDEO DATA, METHOD FOR DECODING VIDEO, VIDEO DATA ENCODER AND VIDEO DECODER.
FIELD OF THE INVENTION
The present invention relates to the field of communications, and in particular, to a method and device for encoding and decoding videos.
BACKGROUND OF THE INVENTION
A complete system for encoding and decoding videos includes an encoding part and a decoding part. Generally, on the encoder side under a hybrid coding framework, the video signals first pass through a prediction module. The encoder selects the best mode among several prediction modes according to certain optimization criteria, and then generates residual signals. The residual signals are transformed and quantized, then sent to an entropy coding module, and finally form output streams. On the decoder side, the output streams are parsed to obtain prediction mode information, and a predicted signal identical to the one in the encoder is generated. After that, quantized transformation coefficient values are obtained from the parsed streams, and inverse quantization and inverse transformation are performed to generate a reconstructed residual signal. Finally, the predicted signal and the reconstructed residual signal are combined to form a reconstructed video signal.
Under a hybrid coding framework, a key technology in the coding process is transformation. The function of the transformation is to convert a residue into another representation through a linear operation on the residual block. Under this representation, the data energy is concentrated in a few transformation coefficients, and the energy of most other coefficients is very low or even zero. Through such a transformation, subsequent entropy coding can be performed efficiently. In video encoding, for a residual block X, if X is considered as a matrix, the transformation is actually a matrix multiplication. One form of the multiplication is F = C · X · R, where C and R are transformation matrices whose dimensions are the same as the dimensions of X, and F is the transformation coefficient matrix that results from the transformation. Compared with other types of transformation in the prior art, the discrete cosine transform (Discrete Cosine Transform, DCT) offers a better balance between complexity and performance and is therefore widely applied.
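For illustration only, the product F = C · X · R and the energy concentration it is meant to achieve can be sketched in Python as follows. The sketch assumes an orthonormal DCT-II matrix for C and its transpose for R; the residual block values and the helper names (`dct_matrix`, `matmul`, `forward_transform`) are hypothetical and are not taken from the specification.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix (n x n); row k is the k-th basis vector."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * m + 1) * k / (2 * n))
                     for m in range(n)])
    return rows

def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def forward_transform(x, c, r):
    """F = C . X . R, the form given in the text."""
    return matmul(matmul(c, x), r)

# Hypothetical smooth 4x4 residual block, chosen only for illustration.
X = [[4, 4, 3, 3],
     [4, 4, 3, 3],
     [3, 3, 2, 2],
     [3, 3, 2, 2]]

C = dct_matrix(4)   # column (vertical) transform
R = transpose(C)    # row (horizontal) transform
F = forward_transform(X, C, R)

total_energy = sum(v * v for row in F for v in row)
dc_share = F[0][0] ** 2 / total_energy  # share of energy in one coefficient
```

For this smooth block, most of the energy ends up in the single coefficient F[0][0], which is the behaviour the paragraph above describes.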
In video encoding technology, a mode-dependent directional transform technology (Mode-dependent Directional Transform, MDDT) is adopted. The essence of MDDT is:
(1) Residues obtained through different intraframe prediction modes reflect different statistical characteristics, and therefore different transformation matrices must be used, according to the different prediction directions, to improve compression coding efficiency; and
(2) to reduce the complexity of the transformation, MDDT separates the columns from the rows and generates a pair of transformation matrices, namely a column transformation matrix Ci and a row transformation matrix Ri. The transformation process is therefore Fi = Ci · X · Ri, where i is the corresponding intraframe prediction mode, X is a prediction residue, and Fi is the transformed prediction residue. Ci and Ri show that the horizontal transformation is separated from the vertical transformation by a matrix Ci and a matrix Ri, which is known as a transformation with columns separated from rows.
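As a minimal sketch of this mode-dependent lookup, the following Python fragment selects a pair (Ci, Ri) by prediction mode i and computes Fi = Ci · X · Ri. The table `MDDT_TABLE` and its 2 × 2 matrices are placeholders invented for the example; the specification does not give the actual matrices used for each mode.

```python
import math

def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

S = 1.0 / math.sqrt(2.0)
HADAMARD = [[S, S], [S, -S]]          # placeholder orthonormal matrix
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]   # placeholder "no transform" matrix

# Hypothetical table: intraframe prediction mode i -> (C_i, R_i).
MDDT_TABLE = {
    0: (HADAMARD, IDENTITY),
    1: (IDENTITY, HADAMARD),
    2: (HADAMARD, HADAMARD),
}

def mddt_forward(x, mode):
    """F_i = C_i . X . R_i with the pair fixed by the prediction mode."""
    c_i, r_i = MDDT_TABLE[mode]
    return matmul(matmul(c_i, x), r_i)

X = [[1.0, 1.0], [1.0, 1.0]]
F2 = mddt_forward(X, 2)   # both directions transformed
F0 = mddt_forward(X, 0)   # only the column transform applied
```

Note that under MDDT each mode maps to exactly one pair, which is the limitation discussed in the next paragraphs.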
In the process of implementing the above transformation, at least the following problems are encountered in the prior art:
Although MDDT technology can apply a different set of transformation matrices to different intraframe prediction directions, in a practical coding process the statistical characteristics of the residual data still differ noticeably even when the intraframe prediction mode is the same. Therefore, the previous method, in which one intraframe prediction mode corresponds to one set of transformation matrices, is not sufficiently accurate, and leads to low subsequent coding efficiency.
SUMMARY OF THE INVENTION
One embodiment of the present invention provides a method and device for encoding and decoding videos, where an efficient transformation matrix corresponding to the characteristics of each residual block is selected for the transformation, which, therefore, improves the encoding efficiency.
To fulfill the above objectives, the embodiments of the present invention provide the following technical solutions:
A method for encoding video data, including:
generate a prediction residue according to the input video data;
selecting a set of best transformation matrices among multiple candidate transformation matrices according to an intraframe prediction mode and rate-distortion criteria to perform transform coding on the prediction residue and obtain a transformation result; and
generating an encoded stream according to the transformation result and the selected transformation matrix index information.
A video data encoder, including:
a residue generation unit, configured to generate a prediction residue according to the input video data;
a transformation unit, configured to select a set of best transformation matrices among multiple candidate transformation matrices according to an intraframe prediction mode and rate-distortion criteria to perform transform coding on the prediction residue and obtain a transformation result; and
a stream generation unit, configured to generate an encoded stream according to the transformation result and the selected transformation matrix index information.
A method for decoding video data, including:
parsing an encoded video stream to obtain a calculation result and transformation coefficient matrix index information;
determining the transformation coefficient matrix among the multiple candidate transformation matrices according to the index information and an intraframe prediction mode; and
using the transformation coefficient matrix to perform inverse transformation on the calculation result to obtain residual data, and reconstructing the video data according to the residual data.
A video decoder, including:
a resolution unit, configured to parse an encoded video stream to obtain a calculation result and transformation coefficient matrix index information;
a determination unit, configured to determine a transformation coefficient matrix among the multiple candidate transformation matrices according to the index information and an intraframe prediction mode; and
a reconstruction unit, configured to use the transformation coefficient matrix to perform inverse transformation on the calculation result to obtain residual data, and to reconstruct the video data according to the residual data.
A method for encoding video data, including:
generate a prediction residue according to the input video data;
select a set of best transformation matrices among multiple candidate transformation matrices according to an intraframe prediction mode and optimization criteria to perform transform coding on the prediction residue and obtain a transformation result; and
generate an encoded stream according to the transformation result and the selected transformation matrix index information.
A method of video decoding, including:
parsing an encoded video stream to obtain a transformation result and transformation matrix index information; and
determining a set of transformation matrices among multiple candidate transformation matrices according to the transformation matrix index information and an intraframe prediction mode, using the set of transformation matrices to perform inverse transformation on the transformation result to obtain residual data, and reconstructing the video data according to the residual data.
A method for encoding video data, including:
generate a prediction residue according to the input video data;
select a set of best transformation matrices among multiple candidate transformation matrices according to optimization criteria to perform transform coding in the prediction residue and obtain a transformation result; and encoding selected transformation matrix index information according to the transformation result and an intraframe prediction mode to generate an encoded flow.
A method of video decoding, including:
parsing an encoded video stream to obtain a transformation result, and obtaining transformation matrix index information according to an intraframe prediction mode; determining a transformation matrix among multiple candidate transformation matrices according to the transformation matrix index information; and using the determined transformation matrix to perform inverse transformation on the transformation result to obtain residual data, and reconstructing the video data according to the residual data.
A video data encoder, including:
a residue generation unit, configured to generate a prediction residue according to the input video data;
a transformation unit, configured to select a set of best transformation matrices among multiple candidate transformation matrices according to optimization criteria to perform transform coding on the prediction residue and obtain a transformation result; and
a stream generation unit, configured to encode the selected transformation matrix index information according to the transformation result and an intraframe prediction mode to generate an encoded stream.
A video decoder, including:
a resolution unit, configured for resolution of a video stream to obtain a transformation result, and obtain transformation matrix index information according to an intraframe prediction mode;
a determination unit, configured to determine a transformation matrix among multiple candidate transformation matrices according to the transformation matrix index information; and
a reconstruction unit, configured to use the determined transformation matrix to perform inverse transformation on the transformation result to obtain residual data, and to reconstruct the video data according to the residual data.
The method and device for encoding and decoding videos in the embodiments of the present invention select a set of best transformation matrices among multiple candidate transformation matrices according to an intraframe prediction mode and rate-distortion criteria to perform transform coding on the prediction residue and obtain a transformation result. With this coding approach, the most efficient transformation matrix corresponding to the characteristics of each residual block is selected for the transformation, which improves the coding efficiency. In addition, transformation coefficient matrices are selected from multiple candidate transformation matrices according to the transformation coefficient matrix index information and the intraframe prediction mode, inverse transformation is performed using the transformation coefficient matrices to obtain residual data, and the video data is reconstructed according to the residual data.
BRIEF DESCRIPTION OF THE DRAWINGS
To describe the technical solution of the present invention more clearly, the accompanying drawings involved in describing the embodiments of the present invention or the prior art are briefly presented below. Evidently, the accompanying drawings are merely illustrative, and persons skilled in the art can derive other drawings from them without creative effort.
figure 1 is a block flow chart of a video encoding method according to an embodiment of the present invention;
figure 2 is a block flow chart of a video decoding method according to an embodiment of the present invention;
figure 3 is a schematic diagram of residual change in a video encoding method according to an embodiment of the present invention;
figure 4 is a block diagram of a video encoder structure according to an embodiment of the present invention;
figure 5 is a block diagram of a video encoder structure according to another embodiment of the present invention;
figure 6 is a block diagram of a video decoder structure according to an embodiment of the present invention;
figure 7 is a block diagram of a video decoder structure according to another embodiment of the present invention;
figure 8 is a block flow chart of another video encoding method according to an embodiment of the present invention;
figure 9 is a block flow chart of another method of video decoding according to an embodiment of the present invention;
figure 10 is a block flow chart of another video encoding method according to an embodiment of the present invention;
Figure 11 is a block flow chart of another method of video decoding according to an embodiment of the present invention;
figure 12 is a block diagram of a structure of another video encoder according to an embodiment of the present invention;
figure 13 is a block diagram of a structure of another video encoder according to an embodiment of the present invention;
figure 14 is a block diagram of a structure of another video decoder according to an embodiment of the present invention; and
figure 15 is a block diagram of a structure of another video decoder according to an embodiment of the present invention.
DETAILED DESCRIPTION OF THE EMBODIMENTS
The following detailed description is given together with the accompanying drawings, in order to provide a clear and complete understanding of the present invention. Of course, the drawings and the detailed description are merely representative of particular embodiments of the present invention, rather than all of the embodiments. All other embodiments that can be derived by those skilled in the art from the embodiments given here without creative effort shall fall within the scope of protection of the present invention.
As shown in figure 1, a method for encoding video data in an embodiment of the present invention includes the following steps:
S101: Generate a prediction residue according to input video data.
S102: Select a set of best transformation matrices among multiple candidate transformation matrices according to an intraframe prediction mode and rate-distortion criteria to perform transform coding on the prediction residue and obtain a transformation result.
In the transformation process, the method of separating columns from rows can be applied. That is, according to the intraframe prediction mode, traverse all possible combinations of a column transformation matrix and a row transformation matrix in the multiple candidate transformation matrices, select the combination with the minimum rate-distortion cost after matrix multiplication as the transformation coefficient matrix, and obtain a transformation result.
S103: Generate an encoded stream according to the transformation result and the selected transformation matrix index information.
In addition, the method may include a coefficient scanning process: selecting a scan order from a set of coefficients according to the intraframe prediction mode and the transformation matrix index information to scan a transformed coefficient.
Then, the mode with the minimum rate-distortion cost after transformation is selected as the best intraframe prediction mode, and its result is quantized and then subjected to entropy coding.
In addition, the transformation coefficient matrix index information can be written to the encoded data.
According to the video encoding method provided in this embodiment, a set of best transformation matrices can be selected among multiple candidate transformation matrices according to an intraframe prediction mode and rate-distortion criteria to perform transform coding on the prediction residue, and a transformation result is obtained. With this coding approach, the most efficient transformation matrix corresponding to the characteristics of each residual block is selected for the transformation, which improves the coding efficiency.
Below, more details are given on the method of encoding video data provided in an embodiment of the present invention with reference to figure 1:
S101: Generate a prediction residue according to input video data.
S102: Select a set of best transformation matrices among multiple candidate transformation matrices according to an intraframe prediction mode and distortion rate criteria to perform transform coding on the prediction residue and obtain a transformation result.
In this embodiment, the selected set of best transformation matrices can be a non-separable transformation matrix or can be a pair of transformation matrices that includes a column transformation matrix and a row transformation matrix.
In this embodiment, a set of best transformation matrices is selected among multiple candidate transformation matrices according to the intraframe prediction mode and the rate-distortion criteria to perform transform coding on the prediction residue and obtain a transformation result. In other words, according to the intraframe prediction mode, transform coding is performed on the prediction residue using multiple candidate transformation matrices, a set of best transformation matrices is selected according to the rate-distortion criteria, and the transformation result corresponding to the set of best transformation matrices is subsequently used together with the selected transformation matrix index information to generate an encoded stream.
In the transformation process, the method of separating columns from rows can be applied. That is, according to the intraframe prediction mode, traverse all possible combinations of the column transformation matrix and the row transformation matrix in the multiple candidate transformation matrices, select the combination with the minimum rate-distortion cost after matrix multiplication as the transformation matrix, and obtain a transformation result.
In other words, the details are as follows: according to the intraframe prediction mode, traverse all combinations of the column transformation matrix and the row transformation matrix in the multiple candidate transformation matrices, select the combination with the minimum rate-distortion cost after residual transform coding as the best transformation matrix, and subsequently use the transformation result corresponding to the set of best transformation matrices together with the selected transformation matrix index information to generate an encoded stream.
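The traversal described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the procedure of the specification: the candidate matrices are orthonormal placeholders, the rate is approximated by the number of nonzero quantized coefficients, the distortion is the squared reconstruction error, and the names `rd_cost`, `select_best_pair`, `qstep`, and `lam` are hypothetical.

```python
import itertools
import math

def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def rd_cost(x, c, r, qstep, lam):
    """Cost D + lam * R for one (column, row) pair; rate is a crude proxy."""
    f = matmul(matmul(c, x), r)                        # forward transform
    q = [[round(v / qstep) for v in row] for row in f]  # uniform quantization
    deq = [[v * qstep for v in row] for row in q]
    # Inverse transform; valid because the candidates are orthonormal.
    rec = matmul(matmul(transpose(c), deq), transpose(r))
    dist = sum((a - b) ** 2 for ra, rb in zip(x, rec) for a, b in zip(ra, rb))
    rate = sum(1 for row in q for v in row if v != 0)
    return dist + lam * rate

def select_best_pair(x, col_candidates, row_candidates, qstep=1.0, lam=1.0):
    """Try every column/row combination and keep the cheapest one."""
    best = None
    pairs = itertools.product(range(len(col_candidates)),
                              range(len(row_candidates)))
    for ci, ri in pairs:
        cost = rd_cost(x, col_candidates[ci], row_candidates[ri], qstep, lam)
        if best is None or cost < best[0]:
            best = (cost, ci, ri)
    return best

S = 1.0 / math.sqrt(2.0)
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
HADAMARD = [[S, S], [S, -S]]
CANDIDATES = [IDENTITY, HADAMARD]    # placeholder candidate set for one mode

X = [[3.0, 3.0], [3.0, 3.0]]         # hypothetical flat residual block
best_cost, best_col, best_row = select_best_pair(X, CANDIDATES, CANDIDATES)
```

The returned indices `best_col` and `best_row` correspond to the transformation matrix index information that is later written to the stream.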
S103: Generate an encoded stream according to the transformation result and the selected transformation matrix index information.
In addition, this embodiment can also include a coefficient scanning process: selecting a scan order from a set of coefficient scan orders according to the intraframe prediction mode and the transformation matrix index information to scan the transformed coefficients.
Then, the one with the minimum distortion rate cost after the transformation is selected as the best intraframe prediction mode, and its result is quantized and then subjected to entropy coding. That is, the prediction residue is encoded in various encoding modes, where the mode with the minimum distortion rate cost is selected as the intraframe prediction mode, and an encoding result is obtained.
In this embodiment, generating the encoded stream according to the transformation result and the selected transformation matrix index information includes: writing the transformation matrix index information into the encoded data.
If the set of best transformation matrices is a pair of transformation matrices, writing the transformation matrix index information into the encoded data includes: encoding the index information of the pair of transformation matrices jointly, or encoding the index information of the pair of transformation matrices separately, and writing the encoding result of the index information into the encoded data.
Joint coding indicates that the column transformation matrix and the row transformation matrix appear in pairs, and each row transformation matrix corresponds to one column transformation matrix; separate coding indicates that a column transformation matrix does not necessarily correspond to a particular row transformation matrix. For example, a row transformation matrix can be combined with an arbitrary column transformation matrix, which can save storage space for transformation matrices.
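The difference between the two signalling options can be illustrated with a short Python sketch. The symbol lists stand in for values that would be passed to an entropy coder; the function names and the string placeholders for matrices are hypothetical.

```python
# Placeholder names stand in for stored column and row matrices.
COL_MATRICES = ["C0", "C1", "C2"]
ROW_MATRICES = ["R0", "R1", "R2"]

def encode_joint(pair_index):
    """Joint coding: one index identifies a fixed (column, row) pair."""
    return [pair_index]

def decode_joint(symbols):
    (p,) = symbols
    return COL_MATRICES[p], ROW_MATRICES[p]

def encode_separate(col_index, row_index):
    """Separate coding: column and row indices are signalled independently."""
    return [col_index, row_index]

def decode_separate(symbols):
    ci, ri = symbols
    return COL_MATRICES[ci], ROW_MATRICES[ri]

# Number of distinct combinations reachable with the same stored matrices.
joint_combinations = len(COL_MATRICES)
separate_combinations = len(COL_MATRICES) * len(ROW_MATRICES)
```

With the same six stored matrices, separate coding can reach nine combinations while joint coding reaches three, which is one way to read the storage remark in the text.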
The video encoding method in this embodiment can select a set of best transformation matrices among multiple candidate transformation matrices according to an intraframe prediction mode and rate-distortion criteria to perform transform coding on the prediction residue and obtain a transformation result. With this coding approach, the most efficient transformation matrix corresponding to the characteristics of each residual block is selected for the transformation, which improves the coding efficiency.
As shown in figure 2, a video decoding method provided in an embodiment of the present invention includes the following steps:
S201: Parse an encoded video stream to obtain a calculation result and transformation coefficient matrix index information.
In addition, the method can also include an inverse coefficient scanning process: select a scan order from a set of coefficient scan orders according to the intraframe prediction mode and the transformation coefficient matrix index information to perform inverse coefficient scanning on the transformed coefficients.
S202: Determine the transformation coefficient matrix among the multiple candidate transformation matrices according to the index information and the intraframe prediction mode, use the transformation coefficient matrix to perform inverse transformation on the calculation result to obtain residual data, and reconstruct the video data according to the residual data.
Specifically, if the separate transformation is applied in the coding transformation process, the transformation coefficient matrix in step S202 can be determined between a set of candidate row transformation matrices and column transformation matrices according to the index information of row transformation coefficient matrix and the index information of column transformation coefficient matrix in the index information, and the intraframe prediction mode.
According to the video decoding method provided in this embodiment, the encoded video stream can be parsed to obtain a calculation result and transformation coefficient matrix index information, the transformation coefficient matrix is determined among multiple candidate transformation matrices according to the index information and the intraframe prediction mode, the transformation coefficient matrix is used to perform inverse transformation on the calculation result to obtain residual data, and the video data is reconstructed according to the residual data. In this way, decoding is performed without increasing complexity. Because the coding is based on the method provided in the previous embodiment, the best transformation matrix can be selected with respect to the residual characteristics, and the entropy coding efficiency is improved. Furthermore, through the decoding method provided in the present embodiment, the overall efficiency of encoding and decoding videos can be improved.
Below, more details are given on the video decoding method provided in an embodiment of the present invention with reference to figure 2:
S201: Resolution of an encoded video stream to obtain a calculation result and transformation matrix index information.
In this modality, the result obtained after the resolution includes the result of the transformation. That is, the calculation result used in this modality is the result of the transformation. The transformation result can include the transformation coefficient matrix obtained after transformation.
In addition, this embodiment also includes an inverse coefficient scanning process: select a scan order from a set of coefficient scan orders according to the intraframe prediction mode and the transformation matrix index information to perform inverse coefficient scanning on the transformed coefficients.
S202: Determine a transformation matrix among multiple candidate transformation matrices according to the transformation matrix index information and an intraframe prediction mode, use the determined transformation matrix to perform inverse transformation on the calculation result to obtain residual data, and reconstruct the video data according to the residual data.
In this embodiment, the given transformation matrix is a set of transformation matrices, and the set of transformation matrices can be a non-separate transformation matrix or it can be a pair of transformation matrices, which include a column transformation matrix. and a line transformation matrix.
Specifically, if the separable transformation is applied in the transformation process during coding, the transformation matrix in step S202 can be determined among a set of candidate row transformation matrices and column transformation matrices according to the row transformation matrix index information and the column transformation matrix index information in the index information, and the intraframe prediction mode. The set of candidate row transformation matrices and column transformation matrices here includes multiple row transformation matrices and multiple column transformation matrices.
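A minimal Python sketch of step S202 for the separable case is given below. It assumes that the candidate matrices are orthonormal, so that the inverse of F = C · X · R is X = Cᵀ · F · Rᵀ; this assumption, the table `CANDIDATES`, and the function names are illustrative and are not stated in the specification.

```python
import math

def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

S = 1.0 / math.sqrt(2.0)
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
HADAMARD = [[S, S], [S, -S]]

# Hypothetical candidate sets, keyed by intraframe prediction mode.
CANDIDATES = {
    0: {"col": [IDENTITY, HADAMARD], "row": [IDENTITY, HADAMARD]},
}

def inverse_transform(f, c, r):
    """X = C^T . F . R^T, assuming C and R are orthonormal."""
    return matmul(matmul(transpose(c), f), transpose(r))

def decode_block(f, mode, col_index, row_index):
    """Pick the matrices named by the parsed indices, then invert."""
    c = CANDIDATES[mode]["col"][col_index]
    r = CANDIDATES[mode]["row"][row_index]
    return inverse_transform(f, c, r)

# Coefficients as they might arrive after parsing and dequantization.
F = [[6.0, 0.0], [0.0, 0.0]]
residual = decode_block(F, mode=0, col_index=1, row_index=1)
```

The decoder performs only a table lookup and one inverse transform per block, which is consistent with the statement that decoding complexity does not increase.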
According to the video decoding method provided in this embodiment, the encoded video stream can be parsed to obtain a calculation result and transformation matrix index information, the transformation matrix is determined among multiple candidate transformation matrices according to the transformation matrix index information and the intraframe prediction mode, the transformation matrix is used to perform inverse transformation on the calculation result to obtain residual data, and the video data is reconstructed according to the residual data. In this way, decoding is performed without increasing complexity. Because the coding is based on the method proposed in the previous embodiment, the best transformation matrix can be selected with respect to the residual characteristics, and therefore the entropy coding efficiency is improved. In addition, through the decoding method provided in the present embodiment, the overall efficiency of encoding and decoding videos can be improved.
H.264/AVC intraframe encoding is taken as an example to describe the method of encoding video data provided in this embodiment.
Step 1: In the H.264/AVC intraframe encoding process, each macroblock is first encoded using the existing I4MB, I16MB, and I8MB modes. The bit rate of each mode is recorded as R_I4MB, R_I16MB, and R_I8MB respectively, and the distortion is recorded as D_I4MB, D_I16MB, and D_I8MB respectively. The rate-distortion cost of each mode is then calculated:
RDcost_I4MB = D_I4MB + λ · R_I4MB,
RDcost_I16MB = D_I16MB + λ · R_I16MB,
RDcost_I8MB = D_I8MB + λ · R_I8MB,
where λ is a constant specified in the coding process. After that, a new macroblock coding mode, that is, the method provided in this embodiment, is applied: the macroblock is encoded in an I4MB_RDOT mode, an I16MB_RDOT mode, and an I8MB_RDOT mode, and the corresponding rate-distortion costs, namely RDcost_I4MB_RDOT, RDcost_I16MB_RDOT, and RDcost_I8MB_RDOT, are calculated.
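The cost comparison in step 1 can be sketched in Python as follows. The distortion and rate figures are invented for the example, and the weighting constant from the cost formula is called `lam` here; in a real encoder these values come from actually coding the macroblock in each mode.

```python
def rd_cost(distortion, rate, lam):
    """Rate-distortion cost: distortion plus weighted bit rate."""
    return distortion + lam * rate

def best_mode(measurements, lam):
    """Return the mode name with the minimum cost, plus all costs."""
    costs = {mode: rd_cost(d, r, lam) for mode, (d, r) in measurements.items()}
    return min(costs, key=costs.get), costs

# Hypothetical (distortion, rate) pairs for one macroblock.
MEASUREMENTS = {
    "I4MB":       (100, 50),
    "I16MB":      (180, 20),
    "I8MB":       (130, 35),
    "I4MB_RDOT":  (90, 48),
    "I16MB_RDOT": (170, 19),
    "I8MB_RDOT":  (120, 34),
}

mode_low_lam, costs_low = best_mode(MEASUREMENTS, lam=2)
mode_high_lam, costs_high = best_mode(MEASUREMENTS, lam=10)
```

A larger weighting constant puts more emphasis on the bit rate, so the preferred mode can change even though the measurements stay the same.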
The I4MB_RDOT, I16MB_RDOT, and I8MB_RDOT modes are described below.
(a) When the macroblock is coded in I4MB_RDOT mode, as in the I4MB coding process, a 16 × 16 macroblock is divided into 16 non-overlapping sub-blocks of size 4 × 4, and the best prediction direction is selected for each sub-block. This step differs from the I4MB coding process in the following respect: when the residue is transformed, several sets of candidate transformation matrices are selected according to the current intraframe prediction mode, and transform coding is performed on the residue; the bit rate R and distortion D corresponding to the different combinations of transformation matrices are recorded, and the rate-distortion costs are calculated; and the transformation matrix combination with the minimum rate-distortion cost is selected as the best combination and is used for the actual encoding of the residual data. For the residue transformation process, see figure 3, where X is the prediction residue, T(X) is the transformed prediction residue, and $C_i^{0 \sim K-1}$ and $R_i^{0 \sim K-1}$ are the candidate transformation matrices corresponding to the prediction direction.
(b) When the macroblock is coded in I8MB_RDOT mode, as in the I8MB coding process, a 16 × 16 macroblock is divided into 4 non-overlapping sub-blocks of size 8 × 8, and the best prediction direction is selected for each sub-block. This step differs from the I8MB coding process in the following respect: when the residue is transformed, several sets of candidate transformation matrices are selected according to the current intraframe prediction mode, and transform coding is performed on the residue; the bit rate R and distortion D corresponding to the different combinations of transformation matrices are recorded, and the rate-distortion costs are calculated; and the transformation matrix combination with the minimum rate-distortion cost is selected as the best combination and is used for the actual encoding of the residual data. For the residue transformation process, see figure 6, where X is the prediction residue, T(X) is the transformed prediction residue, and $C_i^{0 \sim K-1}$ and $R_i^{0 \sim K-1}$ are the candidate transformation matrices corresponding to the prediction direction.
(c) When the macroblock is coded in I16MB_RDOT mode, as in the I16MB coding process, the best prediction direction is selected for each 16 × 16 block. This step differs from the I16MB coding process in the following respect: when the residue is transformed, a set of candidate transformation matrices is selected according to the prediction direction, and all possible combinations of column transformation matrices and row transformation matrices in the set of candidate transformation matrices are traversed; the bit rate R and distortion D corresponding to the different combinations of transformation matrices are recorded respectively, and the rate-distortion cost is calculated; and the transformation matrix combination with the minimum rate-distortion cost is selected as the best combination and is used for the actual encoding of the residual data.
Step 2: When the macroblock coding mode is I4MB_RDOT, I16MB_RDOT, or I8MB_RDOT, a corresponding coefficient scan order is selected for the transformed residue of each sub-block according to the intraframe prediction mode and the transformation matrix.
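One possible way to organise the scan order selection of step 2 is a lookup keyed by prediction mode and transformation matrix index, as in the Python sketch below. The table contents, the 2 × 2 block size, and the names `SCAN_TABLE`, `scan`, and `inverse_scan` are assumptions made for illustration; the specification does not list the actual scan orders.

```python
# Scan orders are lists of (row, column) positions in visiting order.
ROW_MAJOR = [(0, 0), (0, 1), (1, 0), (1, 1)]
COL_MAJOR = [(0, 0), (1, 0), (0, 1), (1, 1)]

# Hypothetical table: (prediction mode, matrix index) -> scan order.
SCAN_TABLE = {
    (0, 0): ROW_MAJOR,
    (0, 1): COL_MAJOR,
    (1, 0): COL_MAJOR,
    (1, 1): ROW_MAJOR,
}

def scan(block, mode, matrix_index):
    """Encoder side: flatten a coefficient block in the selected order."""
    order = SCAN_TABLE[(mode, matrix_index)]
    return [block[r][c] for r, c in order]

def inverse_scan(coeffs, mode, matrix_index, size=2):
    """Decoder side: rebuild the block using the same table entry."""
    order = SCAN_TABLE[(mode, matrix_index)]
    block = [[0] * size for _ in range(size)]
    for value, (r, c) in zip(coeffs, order):
        block[r][c] = value
    return block

F = [[9, 0], [4, 0]]          # hypothetical quantized coefficients
flat = scan(F, mode=0, matrix_index=1)
restored = inverse_scan(flat, mode=0, matrix_index=1)
```

Because both sides derive the order from the same mode and index, no extra signalling is needed for the scan order in this sketch; here the column-major order also groups the nonzero coefficients at the front.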
Step 3: The mode with the minimum rate-distortion cost is selected as the best macroblock coding mode according to the rate-distortion costs corresponding to the six intraframe macroblock coding modes I4MB, I16MB, I8MB, I4MB_RDOT, I16MB_RDOT, and I8MB_RDOT that are obtained in step 1. If the best macroblock coding mode is I4MB, I16MB, or I8MB, in entropy coding of the macroblock header information, an RDOT_ON syntax element is written after the CBP syntax element, and the value assigned to the RDOT_ON syntax element is 0, indicating that the technology presented is not used. If the best macroblock mode is I4MB_RDOT, I16MB_RDOT, or I8MB_RDOT, in entropy coding of the macroblock header information, the RDOT_ON syntax element is written after the CBP syntax element, and the value assigned to the RDOT_ON syntax element is 1, indicating that the technology presented is used. In addition, the transformation matrix index number used by each block of the current macroblock is written sequentially after the RDOT_ON syntax element through entropy coding.
Specifically, the syntax change made by this embodiment to the H.264 video encoding standard is shown in Table 1. In each macroblock header, the RDOT_ON syntax element is written after the existing CBP syntax element. If the macroblock mode is I4MB, I16MB, or I8MB, the RDOT_ON value is 0; if the macroblock mode is I4MB_RDOT, I16MB_RDOT, or I8MB_RDOT, the RDOT_ON value is 1. If the RDOT_ON value is 1, the Transform_matrix_index syntax element (transformation matrix index) is written after the RDOT_ON syntax element, where the Transform_matrix_index syntax element includes the transformation matrix index number selected by each block in the macroblock.
MB mode            I4MB                   I4MB_RDOT
Syntax element     MB TYPE = 9            MB TYPE = 9
                   Transform size flag    Transform size flag
                   Intra 4x4 mode         Intra 4x4 mode
                   Chroma intra mode      Chroma intra mode
                   CBP                    CBP
                   RDOT_ON = 0            RDOT_ON = 1
                   Delta QP               Transform matrix index
                   Luma Coeff             Delta QP
                   Chroma Coeff           Luma Coeff
                                          Chroma Coeff

MB mode            I8MB                   I8MB_RDOT
Syntax element     MB TYPE = 9            MB TYPE = 9
                   Transform size flag    Transform size flag
                   Intra 8x8 mode         Intra 8x8 mode
                   Chroma intra mode      Chroma intra mode
                   CBP                    CBP
                   RDOT_ON = 0            RDOT_ON = 1
                   Delta QP               Transform matrix index
                   Luma Coeff             Delta QP
                   Chroma Coeff           Luma Coeff
                                          Chroma Coeff

MB mode            I16MB                  I16MB_RDOT
Syntax element     MB TYPE = 10           MB TYPE = 10
                   Chroma intra mode      Chroma intra mode
                   RDOT_ON = 0            RDOT_ON = 1
                   Delta QP               Transform matrix index
                   Luma Coeff             Delta QP
                   Chroma Coeff           Luma Coeff
                                          Chroma Coeff
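The element ordering in Table 1 can be illustrated with a short sketch. The function name `mb_header_elements` and the tuple representation are hypothetical and serve only to show where RDOT_ON and the transformation matrix index sit relative to the existing elements; element values that the description does not state are left as `None`.

```python
def mb_header_elements(mb_mode, matrix_indices=()):
    """Return (name, value) pairs in the order listed in Table 1.

    Hypothetical helper for illustration; values not given in the
    description are represented as None.
    """
    rdot_on = mb_mode.endswith("_RDOT")
    base = mb_mode[:-len("_RDOT")] if rdot_on else mb_mode
    if base not in ("I4MB", "I8MB", "I16MB"):
        raise ValueError("unknown macroblock mode: " + mb_mode)

    elements = [("MB TYPE", 10 if base == "I16MB" else 9)]
    if base != "I16MB":
        # Table 1 lists these elements only for the 4x4 and 8x8 modes.
        elements.append(("Transform size flag", None))
        intra_name = "Intra 4x4 mode" if base == "I4MB" else "Intra 8x8 mode"
        elements.append((intra_name, None))
    elements.append(("Chroma intra mode", None))
    if base != "I16MB":
        elements.append(("CBP", None))
    # The new flag follows the existing header elements.
    elements.append(("RDOT_ON", 1 if rdot_on else 0))
    if rdot_on:
        # One index number per block of the macroblock.
        elements.append(("Transform_matrix_index", list(matrix_indices)))
    elements += [("Delta QP", None), ("Luma Coeff", None), ("Chroma Coeff", None)]
    return elements


for name, value in mb_header_elements("I4MB_RDOT", [0, 1, 1, 0]):
    print(name, value)
```

A decoder following the same order would read RDOT_ON first and parse the index numbers only when the flag equals 1.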
Finally, KTA2.4 is used as the test platform with the following settings: all frames are coded as I-frames, CABAC is used, and four QP points (22, 27, 32, and 37) are tested for each sequence. The coding performance of the method provided in this embodiment of the present invention is compared with the coding performance of the MDDT in the prior art, and the average ΔPSNR is calculated.
Table 2 shows the measured results of the QCIF sequence.
Table 2 Measured Results of the QCIF Sequence

Sequence     Format   ΔPSNR (dB)
Bus          QCIF     0.2603
Soccer       QCIF     0.1662
Tempete      QCIF     0.2423
Bodyguard    QCIF     0.1498
Container    QCIF     0.2036
Foreman      QCIF     0.083
Hall         QCIF     0.2408
Mother       QCIF     0.0519
Silence      QCIF     0.1113
Paris        QCIF     0.2400
Table 3 shows the measured results of the CIF sequence.

Sequence     Format   ΔPSNR (dB)
Flower       CIF      0.2596
Mobile       CIF      0.3146
Paris        CIF      0.1717
Stefan       SIF      0.2767
Bus          CIF      0.2398
Bodyguard    CIF      0.1469
Container    CIF      0.1911
Soccer       CIF      0.1017
Foreman      CIF      0.0740
Hall         CIF      0.2123
Silence      CIF      0.0900
Tempete      CIF      0.1070
The tables above show that the method provided in this embodiment improves performance compared with the MDDT transformation method, with a PSNR gain of roughly 0.05 dB to 0.31 dB across the tested sequences.
The following is an analysis of the complexity of the method provided in this embodiment.
Next, the luminance is analyzed.
On the decoder side, the complexity of the method disclosed in this embodiment differs from the complexity of the MDDT transformation method in the following two aspects:
(1) With the method provided in this embodiment, the decoder needs to perform entropy decoding on the newly added RDOT_ON syntax element in each macroblock header. If RDOT_ON = 1, the decoder additionally needs to decode the macroblock header to obtain the transformation matrix index number used by each block in the macroblock.
In comparison with the MDDT technology, this part of the operation adds the complexity of entropy decoding for the two newly added syntax elements: the RDOT_ON flag and the transformation matrix index number. However, the complexity of this part is negligible relative to the complexity of the decoding process as a whole.
(2) For a macroblock (RDOT_ON = 1) to which the method provided in this embodiment is applied, the decoder needs to select a corresponding coefficient scan order and transformation matrix according to the transformation matrix index number obtained through decoding.
This part of the operation is as complex as the MDDT technology, but requires additional storage space to store the candidate transformation matrices and the coefficient scan orders. The I4MB mode has 9 prediction directions; therefore, if 2 candidate row transformation matrices and 2 candidate column transformation matrices exist in each direction and each element of a transformation matrix is an integer ranging from 0 to 128, the total storage space required is 9 × (2 + 2) × 16 × 7 = 4032 bits. The I8MB mode has 9 prediction directions; so if 4 candidate row transformation matrices and 4 candidate column transformation matrices exist in each direction and each element of a transformation matrix is an integer ranging from 0 to 128, the total storage space required is 9 × (4 + 4) × 64 × 7 = 32256 bits. Likewise, the I16MB mode has 4 prediction directions; so if 8 candidate row transformation matrices and 8 candidate column transformation matrices exist in each direction, the total storage space required is 4 × (8 + 8) × 256 × 7 = 114688 bits. Therefore, the total storage space required for I4MB, I16MB, and I8MB is 150976 bits, that is, 18.42 KB. In addition, the space occupied by the matrices that record the coefficient scan orders is much smaller than the space occupied by the transformation matrices, and is not analyzed hereinafter.
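The storage figures above can be checked with a few lines of arithmetic. The sketch below only restates the numbers given in the text (prediction directions, candidate counts, block sizes, and 7 bits per element); the names `MODES` and `storage_bits` are illustrative.

```python
BITS_PER_ELEMENT = 7  # as assumed in the text

# mode -> (prediction directions, row candidates, column candidates, block size N)
MODES = {
    "I4MB": (9, 2, 2, 4),
    "I8MB": (9, 4, 4, 8),
    "I16MB": (4, 8, 8, 16),
}


def storage_bits(directions, rows, cols, n):
    """Bits needed to store all candidate N x N matrices of one mode."""
    return directions * (rows + cols) * n * n * BITS_PER_ELEMENT


per_mode = {mode: storage_bits(*params) for mode, params in MODES.items()}
total_bits = sum(per_mode.values())
total_bytes = total_bits // 8

print(per_mode)     # bits per macroblock mode
print(total_bits)   # total bits for the three modes
print(total_bytes)  # the same total expressed in bytes
```

The total of 150976 bits corresponds to 18872 bytes, which is about 18.4 KB when 1 KB is taken as 1024 bytes.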
On the encoder side, the complexity of the method disclosed in this embodiment differs from the complexity of the MDDT transformation method in the following three aspects:
(1) With the method provided in this embodiment, the encoder needs to write the newly added RDOT_ON syntax element into the macroblock header information of each macroblock through entropy coding. If RDOT_ON = 1, the encoder additionally needs to perform entropy coding on the transformation matrix index number used by each block in the macroblock and write the index number into the macroblock header information. In comparison with the MDDT technology, this part of the operation adds the complexity of entropy coding for the two newly added syntax elements: the RDOT_ON flag and the transformation matrix index number. The additional complexity of this part is negligible relative to the complexity of the coding process as a whole.
(2) With the method described in this embodiment of the present invention, the encoder requires additional storage space to store the candidate transformation matrices and the coefficient scan orders. The required storage space is the same as in the decoder, that is, 18.42 KB.
(3) For intraframe coding, the method disclosed in this embodiment retains the macroblock coding modes I4MB, I16MB, and I8MB, and adds three macroblock coding modes: I4MB_RDOT, I16MB_RDOT, and I8MB_RDOT. For the three newly added macroblock coding modes, the encoder needs to select a best transformation matrix for each residual block.
The video encoding method in this embodiment can select a set of best transformation matrices among multiple candidate transformation matrices according to an intraframe prediction mode and rate-distortion criteria to perform transform coding on the prediction residue and obtain a transformation result. With this coding mode, the most efficient transformation matrix that corresponds to the characteristics of each residual block is selected for the transformation, which improves the coding efficiency.
As shown in figure 4, a video data encoder provided in an embodiment of the present invention includes:
a residue generation unit 401, configured to generate a prediction residue according to the input video data;
a transformation unit 402, configured to select a set of best transformation matrices among multiple candidate transformation matrices according to an intraframe prediction mode and rate-distortion criteria to perform transform coding on the prediction residue and obtain a transformation result; and
a stream generating unit 403, configured to generate an encoded stream according to the transformation result and the selected transformation matrix index information.
The transformation unit 402 is specifically configured to: traverse all combinations of the column transformation matrix and the row transformation matrix in the multiple candidate transformation matrices according to the intraframe prediction mode, select the transformation combination with the minimum rate-distortion cost after matrix multiplication as the best transformation coefficient matrix, and obtain a transformation result.
In addition, as shown in figure 5, the video data encoder additionally includes:
a coefficient scanning unit 501, configured to select a scan order from a set of coefficients according to the intraframe prediction mode and the transformation matrix index information to scan the transformed coefficient;
a judgment unit 502, configured to determine the mode with the minimum rate-distortion cost as the intraframe prediction mode after the prediction residue is encoded in various encoding modes, and obtain an encoding result; and
an index encoding unit 503, configured to record the transformation coefficient matrix index information in the encoded data.
The video encoder provided in this embodiment can select a set of best transformation matrices among multiple candidate transformation matrices according to an intraframe prediction mode and rate-distortion criteria to perform transform coding on the prediction residue and obtain a transformation result. With this coding mode, the most efficient transformation matrix that corresponds to the characteristics of each residual block is selected for the transformation, which improves the coding efficiency.
The following describes in more detail the video data encoder provided in one embodiment of the present invention with reference to figure 4 and figure 5:
As shown in figure 4, a video data encoder provided in an embodiment of the present invention includes:
a residue generation unit 401, configured to generate a prediction residue according to the input video data;
a transformation unit 402, configured to select a set of best transformation matrices among multiple candidate transformation matrices according to an intraframe prediction mode and rate-distortion criteria to perform transform coding on the prediction residue and obtain a transformation result; and
a stream generating unit 403, configured to generate an encoded stream according to the transformation result and the selected transformation matrix index information.
In this embodiment, the selected set of best transformation matrices can be a non-separable transformation matrix, or it can be a pair of transformation matrices that includes a column transformation matrix and a row transformation matrix.
In this embodiment, the transformation unit 402 selects a set of best transformation matrices among multiple candidate transformation matrices according to the intraframe prediction mode and the rate-distortion criteria to perform transform coding on the prediction residue and obtain a transformation result. In other words, according to the intraframe prediction mode, transform coding is performed on the prediction residue using multiple candidate transformation matrices, a set of best transformation matrices is selected according to the rate-distortion criteria, and the transformation result corresponding to the set of best transformation matrices is subsequently used together with the selected transformation matrix index information to generate an encoded stream.
The transformation unit 402 is specifically configured to: traverse all combinations of the column transformation matrix and the row transformation matrix in the multiple candidate transformation matrices according to the intraframe prediction mode, select the transformation combination with the minimum rate-distortion cost after matrix multiplication as the best transformation matrix, and obtain a transformation result. In other words, the details are: according to the intraframe prediction mode, traverse all combinations of the column transformation matrix and the row transformation matrix in the multiple candidate transformation matrices, select the transformation combination with the minimum rate-distortion cost after transform coding of the residue as the best transformation matrix, and subsequently use the transformation result corresponding to the set of best transformation matrices together with the selected transformation matrix index information to generate an encoded stream.
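The traversal performed by the transformation unit 402 can be sketched as an exhaustive search over (column, row) candidate pairs. The sketch assumes the separable form T(X) = C · X · R, which the description does not spell out, and it replaces the real rate-distortion cost (which requires quantization and entropy coding) with a toy cost that counts nonzero coefficients. The names `select_best_pair` and `toy_cost` are hypothetical.

```python
from itertools import product


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def toy_cost(coeffs):
    """Stand-in for the rate-distortion cost: number of nonzero coefficients."""
    return sum(1 for row in coeffs for v in row if v != 0)


def select_best_pair(residual, col_candidates, row_candidates, cost_fn=toy_cost):
    """Try every (column, row) pair and keep the one with the lowest cost.

    Returns (col_index, row_index, cost, coefficients).
    """
    best = None
    pairs = product(enumerate(col_candidates), enumerate(row_candidates))
    for (ci, c), (ri, r) in pairs:
        coeffs = matmul(matmul(c, residual), r)  # assumed T(X) = C * X * R
        cost = cost_fn(coeffs)
        if best is None or cost < best[2]:
            best = (ci, ri, cost, coeffs)
    return best


IDENTITY = [[1, 0], [0, 1]]
HADAMARD = [[1, 1], [1, -1]]
residual = [[1, 1], [1, 1]]
best = select_best_pair(residual, [IDENTITY, HADAMARD], [IDENTITY, HADAMARD])
print(best)
```

The two returned indices play the role of the transformation matrix index information that is written to the stream; for the flat residual in this example the Hadamard pair compacts all energy into a single coefficient.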
In addition, as shown in figure 5, the video data encoder also includes:
a coefficient scanning unit 501, configured to select a scan order from a set of coefficients according to the intraframe prediction mode and the transformation matrix index information to scan the transformed coefficient;
a judgment unit 502, configured to determine the mode with the minimum rate-distortion cost as the intraframe prediction mode after the prediction residue is encoded in various encoding modes, and obtain an encoding result; and
an index encoding unit 503, configured to record the transformation matrix index information for the encoded data.
If the set of best transformation matrices is a pair of transformation matrices, recording the transformation matrix index information in the encoded data includes: encoding the index information of the pair of transformation matrices jointly, or encoding the index information of the pair of transformation matrices separately, and recording the encoding result of the index information in the encoded data.
Joint encoding indicates that the column transformation matrix and the row transformation matrix appear in pairs, and each row transformation matrix corresponds to one column transformation matrix; separate encoding indicates that a column transformation matrix does not necessarily correspond to a particular row transformation matrix. For example, a row transformation matrix can be combined with any column transformation matrix, which can save storage space for the transformation matrices.
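One way to read the difference between joint and separate encoding is in terms of which (column, row) combinations an index can address. The following sketch reflects that reading only; the function names and the tuple-shaped index keys are assumptions, and the matrices are represented by placeholder labels.

```python
def joint_combinations(col_mats, row_mats):
    """Joint encoding: one index k addresses the fixed pair (C_k, R_k)."""
    if len(col_mats) != len(row_mats):
        raise ValueError("joint encoding needs equally many column and row matrices")
    return {(k,): (col_mats[k], row_mats[k]) for k in range(len(col_mats))}


def separate_combinations(col_mats, row_mats):
    """Separate encoding: indices (i, j) address any pair (C_i, R_j)."""
    return {(i, j): (col_mats[i], row_mats[j])
            for i in range(len(col_mats))
            for j in range(len(row_mats))}


cols = ["C0", "C1"]  # placeholder labels standing in for matrices
rows = ["R0", "R1"]
joint = joint_combinations(cols, rows)
separate = separate_combinations(cols, rows)
print(len(joint), len(separate))
```

With the same four stored matrices, separate encoding reaches four combinations while joint encoding reaches two, which is consistent with the storage saving mentioned above.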
The video encoder provided in this embodiment can select a set of best transformation matrices among multiple candidate transformation matrices according to an intraframe prediction mode and rate-distortion criteria to perform transform coding on the prediction residue and obtain a transformation result. With this coding mode, the most efficient transformation matrix that corresponds to the characteristics of each residual block is selected for the transformation, which improves the coding efficiency.
As shown in figure 6, a video decoder provided in an embodiment of the present invention includes:
a resolution unit 601, configured to resolve a video stream to obtain a calculation result and encoded transformation coefficient matrix index information;
a determination unit 602, configured to determine a transformation coefficient matrix among the multiple candidate transformation matrices according to the index information and an intraframe prediction mode; and
a reconstruction unit 603, configured to use the transformation coefficient matrix to perform inverse transformation on the calculation result to obtain residual data, and to reconstruct the video data according to the residual data.
In addition, as shown in figure 7, the video decoder also includes:
an inverse coefficient scanning unit 701, configured to select a scan order of a set of coefficients according to the intraframe prediction mode and the transformation coefficient matrix index information to perform inverse coefficient scanning on the transformed coefficients.
The video decoder provided in this embodiment can resolve the encoded video stream to obtain a calculation result and encoded transformation coefficient matrix index information, determine the transformation coefficient matrix among multiple candidate transformation matrices according to the index information and the intraframe prediction mode, use the transformation coefficient matrix to perform inverse transformation on the calculation result to obtain residual data, and reconstruct the video data according to the residual data. In this way, decoding is performed without increasing complexity. Because the coding is based on the method provided in the previous embodiment, the best transformation matrix can be selected with respect to the residual characteristics, and the entropy coding efficiency is improved. In addition, through the decoding method provided in the present embodiment, the efficiency of video encoding and decoding can be improved as a whole.
The following describes in more detail the video decoder provided in an embodiment of the present invention with reference to figure 6 and figure 7:
As shown in figure 6, a video decoder provided in an embodiment of the present invention includes:
a resolution unit 601, configured to resolve a video stream to obtain a calculation result and transformation matrix index information;
a determination unit 602, configured to determine a transformation matrix among multiple candidate transformation matrices according to the transformation matrix index information and the intraframe prediction mode; and
a reconstruction unit 603, configured to use the determined transformation matrix to perform inverse transformation on the calculation result to obtain residual data, and reconstruct the video data according to the residual data.
In this embodiment, the result obtained by the resolution unit 601 through resolution includes the transformation result. That is, the calculation result used in this embodiment is the transformation result. The transformation result can include the transformation coefficient matrix obtained after transformation.
In this embodiment, the determined transformation matrix is a set of transformation matrices, and the set of transformation matrices can be a non-separable transformation matrix, or it can be a pair of transformation matrices that includes a column transformation matrix and a row transformation matrix.
If a separable transformation is applied in the transform coding process, the determination unit 602 is configured to determine the transformation matrices among a set of candidate row transformation matrices and column transformation matrices according to the row transformation matrix index information and the column transformation matrix index information in the transformation matrix index information, and the intraframe prediction mode. The set of candidate row transformation matrices and column transformation matrices here can include multiple row transformation matrices and column transformation matrices. The reconstruction unit 603 uses the row transformation matrix and the column transformation matrix to perform inverse transformation on the calculation result to obtain residual data, and to reconstruct the video data according to the residual data.
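The lookup and inverse transformation carried out by units 602 and 603 can be sketched as follows. The sketch assumes a forward transform of the form Y = C · X · R with orthonormal C and R, so that the inverse is X = Cᵀ · Y · Rᵀ; the description does not state this property, and the candidate table `CANDIDATES` with its two tiny matrices is purely hypothetical.

```python
def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


IDENTITY = [[1, 0], [0, 1]]
SWAP = [[0, 1], [1, 0]]  # a permutation matrix, hence orthonormal

# Hypothetical table: intra mode -> (column candidates, row candidates)
CANDIDATES = {0: ([IDENTITY, SWAP], [IDENTITY, SWAP])}


def determine_matrices(intra_mode, col_index, row_index):
    """Pick the column and row matrices signalled by the decoded indices."""
    cols, rows = CANDIDATES[intra_mode]
    return cols[col_index], rows[row_index]


def inverse_transform(coeffs, col_mat, row_mat):
    """Invert Y = C * X * R for orthonormal C and R: X = C^T * Y * R^T."""
    return matmul(matmul(transpose(col_mat), coeffs), transpose(row_mat))


residual = [[1, 2], [3, 4]]
c, r = determine_matrices(0, 1, 1)
coeffs = matmul(matmul(c, residual), r)       # what the encoder would send
recovered = inverse_transform(coeffs, c, r)   # what the decoder reconstructs
print(coeffs, recovered)
```

Because the same table is stored on both sides, the decoder needs only the intraframe prediction mode and the two index numbers to recover the residue.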
In addition, as shown in figure 7, the video decoder includes:
an inverse coefficient scanning unit 701, configured to select a scan order from a set of coefficients according to the intraframe prediction mode and the transformation matrix index information to perform inverse coefficient scanning on the transformed coefficient.
The video decoder provided in this embodiment can resolve the encoded video stream to obtain a calculation result and transformation matrix index information, determine the transformation matrix among multiple candidate transformation matrices according to the index information and the intraframe prediction mode, use the transformation matrix to perform inverse transformation on the calculation result to obtain residual data, and reconstruct the video data according to the residual data. In this way, decoding is performed without increasing complexity. Because the coding is based on the method provided in the previous embodiment, the best transformation matrix can be selected with respect to the residual characteristics, and the entropy coding efficiency is improved. In addition, through the decoding method provided in the present embodiment, the efficiency of video encoding and decoding can be improved as a whole.
Another method for encoding video data is provided in an embodiment of the present invention. As shown in figure 8, the method includes the following steps:
S801: Generate a prediction residue according to input video data.
S802: Select a set of best transformation matrices among multiple candidate transformation matrices according to an intraframe prediction mode and optimization criteria to perform transform coding on the prediction residue and obtain a transformation result.
In this embodiment, the selected set of best transformation matrices can be a non-separable transformation matrix, or it can be a pair of transformation matrices that includes a column transformation matrix and a row transformation matrix.
The optimization criteria can be rate-distortion criteria, the sum of absolute differences (SAD), the number of coded bits, or distortion. Selection according to the optimization criteria can be performed in many ways, for example, selecting the candidate with the minimum rate-distortion cost, the minimum SAD, the minimum number of coded bits, or the minimum distortion.
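The interchangeable optimization criteria can be illustrated as a table of cost functions applied to per-candidate measurements. The sketch assumes the customary Lagrangian form J = D + λ·R for the rate-distortion cost, which the description does not give explicitly; the dictionary keys, the measurement field names, and the sample numbers are made up for illustration.

```python
# Each criterion maps one candidate's measurements to a scalar cost.
CRITERIA = {
    "rate_distortion": lambda m, lam: m["distortion"] + lam * m["bits"],
    "sad": lambda m, lam: m["sad"],
    "bits": lambda m, lam: m["bits"],
    "distortion": lambda m, lam: m["distortion"],
}


def select_best(measurements, criterion, lam=1.0):
    """Return the index of the candidate with the minimum cost."""
    cost = CRITERIA[criterion]
    return min(range(len(measurements)),
               key=lambda i: cost(measurements[i], lam))


# Made-up measurements for three candidate transformation matrices.
measurements = [
    {"distortion": 10, "bits": 2, "sad": 7},
    {"distortion": 4, "bits": 9, "sad": 5},
    {"distortion": 6, "bits": 3, "sad": 9},
]

for name in CRITERIA:
    print(name, select_best(measurements, name))
```

Different criteria can pick different candidates for the same block, so the encoder and the evaluation must agree on which criterion is in use; the decoder is unaffected because it only reads the signalled index.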
In this embodiment, a set of best transformation matrices is selected among the multiple candidate transformation matrices according to the intraframe prediction mode and the optimization criteria to perform transform coding on the prediction residue and obtain a transformation result. In other words, according to the intraframe prediction mode, transform coding is performed on the prediction residue using multiple candidate transformation matrices, a set of best transformation matrices is selected according to the optimization criteria, and the transformation result corresponding to the set of best transformation matrices is subsequently used together with the selected transformation matrix index information to generate an encoded stream.
In certain implementations, a transformation with separate column and row matrices can be applied in the transformation process. That is: according to the intraframe prediction mode, traverse all possible combinations of the column transformation matrix and the row transformation matrix in the multiple candidate transformation matrices, select the transformation combination with the minimum optimization criteria cost after transform coding of the residue as the best transformation matrix, and obtain a transformation result. In other words, the details are: according to the intraframe prediction mode, traverse all combinations of the column transformation matrix and the row transformation matrix in the multiple candidate transformation matrices, select the transformation combination with the minimum optimization criteria cost after transform coding of the residue as the best transformation matrix, and subsequently use the transformation result corresponding to the set of best transformation matrices together with the selected transformation matrix index information to generate an encoded stream.
S803: Generate an encoded stream according to the transformation result and the selected transformation matrix index information.
In addition, this embodiment can also include a coefficient scanning process: selecting a scan order of a set of coefficients according to the intraframe prediction mode and the transformation matrix index information to scan the transformed coefficients.
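The coefficient scanning process can be pictured as a table lookup followed by a reordering of the coefficient block. The sketch below is only an illustration: the table `SCAN_ORDERS`, its keys, and the two 2 × 2 orders are invented, since the description does not list the actual scan orders.

```python
# Hypothetical scan tables keyed by (intra mode, transformation matrix index).
# Each order lists (row, column) positions in the sequence they are read.
SCAN_ORDERS = {
    (0, 0): [(0, 0), (0, 1), (1, 0), (1, 1)],  # row by row
    (0, 1): [(0, 0), (1, 0), (0, 1), (1, 1)],  # column by column
}


def scan(block, intra_mode, matrix_index):
    """Encoder side: flatten the coefficient block in the selected order."""
    order = SCAN_ORDERS[(intra_mode, matrix_index)]
    return [block[r][c] for r, c in order]


def inverse_scan(values, intra_mode, matrix_index, n):
    """Decoder side: place the scanned values back into an n x n block."""
    order = SCAN_ORDERS[(intra_mode, matrix_index)]
    block = [[0] * n for _ in range(n)]
    for value, (r, c) in zip(values, order):
        block[r][c] = value
    return block


coeffs = [[1, 2], [3, 4]]
scanned = scan(coeffs, 0, 1)
restored = inverse_scan(scanned, 0, 1, 2)
print(scanned, restored)
```

Because the order depends on both the prediction mode and the matrix index, the decoder can select the matching inverse order as soon as the index has been decoded.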
Then, the mode with the minimum optimization criteria cost after the transformation is selected as the best intraframe prediction mode, and its result is quantized and then undergoes entropy coding. That is, the prediction residue is coded in several coding modes, the mode with the minimum optimization criteria cost is selected as the intraframe prediction mode, and a coding result is obtained.
In this embodiment, generating the encoded stream according to the transformation result and the selected transformation matrix index information includes: recording the transformation matrix index information in the encoded data.
If the set of best transformation matrices is a pair of transformation matrices, recording the transformation matrix index information in the encoded data includes: encoding the index information of the pair of transformation matrices jointly, or encoding the index information of the pair of transformation matrices separately, and recording the encoding result of the index information in the encoded data.
Joint coding indicates that the column transformation matrix and the row transformation matrix appear in pairs, and each row transformation matrix corresponds to one column transformation matrix; separate coding indicates that a column transformation matrix does not necessarily correspond to a particular row transformation matrix. For example, a row transformation matrix can be combined with any column transformation matrix, which can save storage space for the transformation matrices.
According to the video encoding method in this embodiment, a set of best transformation matrices can be selected among multiple candidate transformation matrices according to an intraframe prediction mode and optimization criteria to perform transform coding on the prediction residue and obtain a transformation result. With this coding mode, the most efficient transformation matrix that corresponds to the characteristics of each residual block is selected for the transformation, which improves the coding efficiency.
Another method of video decoding is provided in an embodiment of the present invention. As shown in figure 9, the method includes the following steps:
S901: Resolve an encoded video stream to obtain a transformation result and transformation matrix index information.
In addition, this embodiment also includes an inverse coefficient scanning process: select a scan order of a set of coefficients according to the intraframe prediction mode and the transformation matrix index information to perform inverse coefficient scanning on the transformed coefficients.
S902: Determine a set of transformation matrices among multiple candidate transformation matrices according to the transformation matrix index information and an intraframe prediction mode, use the set of transformation matrices to perform inverse transformation on the transformation result to obtain residual data, and reconstruct the video data according to the residual data.
In this embodiment, the determined set of transformation matrices can be a non-separable transformation matrix, or it can be a pair of transformation matrices that includes a column transformation matrix and a row transformation matrix.
In this embodiment, the result obtained after the resolution includes the transformation result. That is, the transformation result used in this embodiment is the calculation result. The transformation result can include the transformation coefficient matrix obtained after transformation.
Specifically, if a separable transformation is applied in the transform coding process, the set of transformation matrices in step S902 can be determined among several candidate row transformation matrices and column transformation matrices according to the row transformation matrix index information and the column transformation matrix index information in the transformation matrix index information, and the intraframe prediction mode.
According to the video decoding method provided in this embodiment, the encoded video stream can be resolved to obtain a transformation result and transformation matrix index information, a set of transformation matrices is determined among multiple candidate transformation matrices according to the transformation matrix index information and the intraframe prediction mode, the set of transformation matrices is used to perform inverse transformation on the transformation result to obtain residual data, and the video data is reconstructed according to the residual data. In this way, decoding is performed without increasing complexity. Because the coding is based on the method provided in the previous embodiment, the best transformation matrix can be selected with respect to the residual characteristics, and the entropy coding efficiency is improved. In addition, through the decoding method provided in the present embodiment, the efficiency of video encoding and decoding can be improved as a whole.
As shown in figure 10, a method for encoding video data in an embodiment of the present invention includes the following steps:
S1001: Generate a prediction residue according to input video data.
S1002: Select a set of best transformation matrices among multiple candidate transformation matrices according to optimization criteria to perform transform coding on the prediction residue and obtain a transformation result.
In this embodiment, the selected set of best transformation matrices can be a non-separable transformation matrix, or it can be a pair of transformation matrices that includes a column transformation matrix and a row transformation matrix.
The optimization criteria can be rate-distortion criteria, the sum of absolute differences (SAD), the number of coded bits, or distortion. Selection according to the optimization criteria can be performed in many ways, for example, selecting the candidate with the minimum rate-distortion cost, the minimum SAD, the minimum number of coded bits, or the minimum distortion.
In this embodiment, a set of best transformation matrices is selected among the multiple candidate transformation matrices according to the optimization criteria to perform transform coding on the prediction residue and obtain a transformation result. In other words, transform coding is performed on the prediction residue using multiple candidate transformation matrices, a set of best transformation matrices is selected according to the optimization criteria, and the transformation result corresponding to the set of best transformation matrices is subsequently used together with the selected transformation matrix index information to generate an encoded stream.
In certain implementations, the transformation can also be applied separably, with the column transform and the row transform handled separately. That is: traverse all possible combinations of a column transformation matrix and a row transformation matrix among the multiple candidate transformation matrices, select the combination with the minimum optimization-criteria cost after transform coding of the residue as the best transformation matrices, and obtain a transformation result. The transformation result corresponding to the set of best transformation matrices is then used, together with the selected transformation matrix index information, to generate an encoded stream.
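The traversal of column and row matrix combinations described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of the embodiment: it assumes the separable transform is computed as C * X * R^T, and it uses the sum of absolute coefficient values as a stand-in cost in place of a true optimization-criteria cost. All function names are hypothetical.

```python
import math
from itertools import product
from typing import Callable, List, Sequence, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(m: Matrix) -> Matrix:
    return [list(col) for col in zip(*m)]


def separable_transform(residual: Matrix, col_mat: Matrix, row_mat: Matrix) -> Matrix:
    # Assumed convention: coefficients = C * X * R^T.
    return matmul(matmul(col_mat, residual), transpose(row_mat))


def l1_cost(coeffs: Matrix) -> float:
    # Stand-in cost (assumption): fewer and smaller coefficients are cheaper.
    return sum(abs(c) for row in coeffs for c in row)


def select_best_pair(residual: Matrix,
                     col_candidates: Sequence[Matrix],
                     row_candidates: Sequence[Matrix],
                     cost: Callable[[Matrix], float] = l1_cost
                     ) -> Tuple[int, int, Matrix]:
    """Try every (column, row) combination and keep the cheapest one."""
    best = None
    for (ci, c), (ri, r) in product(enumerate(col_candidates),
                                    enumerate(row_candidates)):
        coeffs = separable_transform(residual, c, r)
        j = cost(coeffs)
        if best is None or j < best[0]:
            best = (j, ci, ri, coeffs)
    if best is None:
        raise ValueError("no candidate matrices supplied")
    _, ci, ri, coeffs = best
    return ci, ri, coeffs


# Example candidates: identity and a 2x2 Haar-like orthonormal matrix.
A = 1 / math.sqrt(2)
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
HAAR = [[A, A], [A, -A]]
```

For a flat 2x2 residue, the Haar-like pair concentrates all energy in a single coefficient, so it is the combination this sketch selects; the two returned indices play the role of the transformation matrix index information.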
S1003: Encode the selected transformation matrix index information according to the transformation result and an intraframe prediction mode to generate an encoded stream.
In addition, this embodiment may also include a coefficient scanning process: select a scan order for a set of coefficients according to the transformation matrix index information to scan the transformed coefficients.
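The coefficient scanning process can be illustrated with the following sketch. The mapping from transformation matrix index to scan order (`scan_table`) is purely hypothetical; this description does not specify which scan order goes with which index.

```python
from typing import Dict, List, Tuple

Scan = List[Tuple[int, int]]


def zigzag_scan(n: int) -> Scan:
    """Classic zigzag order over an n x n block."""
    order: Scan = []
    for s in range(2 * n - 1):
        diag = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        # Odd anti-diagonals run top-right to bottom-left, even ones the reverse.
        order.extend(diag if s % 2 else reversed(diag))
    return order


def row_scan(n: int) -> Scan:
    return [(i, j) for i in range(n) for j in range(n)]


def column_scan(n: int) -> Scan:
    return [(i, j) for j in range(n) for i in range(n)]


def scan_table(n: int) -> Dict[int, Scan]:
    # Hypothetical assignment of scan orders to transformation matrix indices.
    return {0: zigzag_scan(n), 1: row_scan(n), 2: column_scan(n)}


def scan_coefficients(coeffs: List[List[int]], transform_index: int) -> List[int]:
    """Flatten a square coefficient block in the order tied to the index."""
    order = scan_table(len(coeffs))[transform_index]
    return [coeffs[i][j] for i, j in order]
```

Tying the scan order to the index lets the scan follow the typical energy layout produced by the selected matrices without signalling the scan order separately.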
Then, the mode with the minimum optimization-criteria cost after transformation is selected as the best intraframe prediction mode, and its result is quantized and then entropy coded. That is, the prediction residue is coded in several coding modes, the mode with the minimum optimization-criteria cost is selected as the intraframe prediction mode, and a coding result is obtained.
In this embodiment, encoding the selected transformation matrix index information according to the transformation result and the intraframe prediction mode to generate the encoded stream includes: selecting, according to the selected intraframe prediction mode, a method for encoding the transformation matrix index information, and recording the transformation matrix index information in the encoded data. For different intraframe prediction modes, different methods for encoding the transformation matrix index information can be selected to record the transformation matrix index information in the encoded data. If the set of best transformation matrices is a pair of transformation matrices, recording the transformation matrix index information in the encoded data includes: encoding the index information of the pair of transformation matrices jointly, or encoding the index information of the pair of transformation matrices separately, and recording the result of encoding the index information in the encoded data according to the intraframe prediction mode.
Joint encoding indicates that the column transformation matrix and the row transformation matrix appear in pairs, and each row transformation matrix corresponds to a column transformation matrix; separate encoding indicates that a column transformation matrix does not necessarily correspond to a row transformation matrix. For example, a row transformation matrix can correspond to an arbitrary column transformation matrix, which can save storage space for transformation matrices.
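The difference between joint and separate encoding of the index information can be sketched as follows. Fixed-length binary codewords are an assumption made only for this illustration; the description leaves the actual encoding method open and allows it to depend on the intraframe prediction mode. All names are hypothetical.

```python
from typing import List, Tuple

Pair = Tuple[int, int]  # (column matrix index, row matrix index)


def bits_needed(count: int) -> int:
    """Width of a fixed-length code that can address `count` items."""
    return max(1, (count - 1).bit_length())


def to_bits(value: int, width: int) -> str:
    return format(value, f"0{width}b")


def encode_joint(pair: Pair, allowed_pairs: List[Pair]) -> str:
    # Joint encoding: one index into a list of predefined (column, row) pairs.
    return to_bits(allowed_pairs.index(pair), bits_needed(len(allowed_pairs)))


def encode_separate(pair: Pair, n_col: int, n_row: int) -> str:
    # Separate encoding: the column index and the row index are written
    # one after the other, so any column can be combined with any row.
    col_idx, row_idx = pair
    return to_bits(col_idx, bits_needed(n_col)) + to_bits(row_idx, bits_needed(n_row))
```

With four stored column matrices and four stored row matrices, separate encoding can address 16 combinations, whereas joint encoding over four fixed pairs addresses only those four; this is the storage saving the paragraph above refers to.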
According to the video encoding method provided in this embodiment, a set of best transformation matrices can be selected from multiple candidate transformation matrices according to the optimization criteria to perform transform coding on the prediction residue and obtain a transformation result. Through such a coding mode, the most efficient transformation matrix corresponding to the characteristics of each residual block is selected for the transformation, which improves the coding efficiency.
As shown in figure 11, a video decoding method provided in an embodiment of the present invention includes the following steps:
S1101: Resolve an encoded video stream to obtain a transformation result, and obtain transformation matrix index information according to an intraframe prediction mode.
In this embodiment, the result obtained after the resolution includes the transformation result. That is, the transformation result used in this embodiment is the result of the calculation. The transformation result can include the transformation coefficient matrix obtained after transformation. Obtaining the transformation matrix index information according to the intraframe prediction mode includes: selecting, according to the intraframe prediction mode, a method for decoding the transformation matrix index information, and obtaining the transformation matrix index information through decoding. For different intraframe prediction modes, different resolution methods can be applied to resolve the video stream and obtain the transformation matrix index information.
In addition, this embodiment also includes an inverse coefficient scanning process: select a scan order for a set of coefficients according to the transformation matrix index information to perform an inverse coefficient scan on the transformed coefficients.
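As a minimal sketch, inverse coefficient scanning places each decoded value back at the block position given by the scan order. The function name `inverse_scan` is hypothetical, and the scan order is passed in explicitly because the description does not fix how an index maps to an order.

```python
from typing import List, Sequence, Tuple


def inverse_scan(values: Sequence[int],
                 order: Sequence[Tuple[int, int]],
                 n: int) -> List[List[int]]:
    """Rebuild an n x n coefficient block from a scanned 1-D sequence."""
    if len(values) != n * n or len(order) != n * n:
        raise ValueError("values and order must both cover the n x n block")
    block = [[0] * n for _ in range(n)]
    for value, (i, j) in zip(values, order):
        block[i][j] = value
    return block
```

The decoder must use the same scan order as the encoder for a given index; otherwise the coefficients are restored to the wrong positions.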
S1102: Determine a transformation matrix among multiple candidate transformation matrices according to the transformation matrix index information, use the determined transformation matrix to perform inverse transformation on the transformation result to obtain residual data, and reconstruct the video data according to the residual data.
In this embodiment, the determined transformation matrix can be a set of transformation matrices, and the set of transformation matrices can be a non-separable transformation matrix or it can be a pair of transformation matrices, which includes a column transformation matrix and a row transformation matrix.
Specifically, if the separable transformation is applied in the decoding transformation process, the transformation matrix in step S1102 can be determined among a set of candidate row transformation matrices and column transformation matrices according to the row transformation matrix index information and the column transformation matrix index information in the index information. The set of candidate row transformation matrices and column transformation matrices here includes multiple row transformation matrices and multiple column transformation matrices.
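The decoder-side use of the two indices can be sketched as follows. The sketch assumes orthonormal candidate matrices and a forward transform of the form Y = C * X * R^T, so that the inverse is X = C^T * Y * R; neither assumption is stated in this description, and all names are hypothetical.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(m: Matrix) -> Matrix:
    return [list(col) for col in zip(*m)]


def inverse_separable_transform(coeffs: Matrix, col_mat: Matrix, row_mat: Matrix) -> Matrix:
    # Valid only for orthonormal matrices (assumption): X = C^T * Y * R.
    return matmul(matmul(transpose(col_mat), coeffs), row_mat)


def reconstruct_block(prediction: Matrix,
                      coeffs: Matrix,
                      col_index: int,
                      row_index: int,
                      col_candidates: Sequence[Matrix],
                      row_candidates: Sequence[Matrix]) -> Matrix:
    """Pick the matrices by index, invert the transform, add the prediction."""
    col_mat = col_candidates[col_index]
    row_mat = row_candidates[row_index]
    residual = inverse_separable_transform(coeffs, col_mat, row_mat)
    return [[p + r for p, r in zip(p_row, r_row)]
            for p_row, r_row in zip(prediction, residual)]
```

Quantization and dequantization are omitted here; in a full decoder the coefficients would be dequantized before the inverse transformation.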
According to the video decoding method provided in this embodiment, the encoded video stream can be resolved to obtain a transformation result, and the transformation matrix index information is obtained through resolution according to the intraframe prediction mode; a transformation matrix is determined among multiple candidate transformation matrices according to the transformation matrix index information, the transformation matrix is used to perform the inverse transformation on the transformation result to obtain residual data, and the video data is reconstructed according to the residual data. In this way, decoding is performed without increasing complexity. Because the encoding is based on the method provided in the previous embodiment, the best transformation matrix can be selected with respect to the residual characteristics, and the entropy coding efficiency is improved. In addition, through the decoding method provided in the present embodiment, the efficiency of video encoding and decoding can be improved as a whole.
As shown in figure 12, a video data encoder provided in an embodiment of the present invention includes:
a residue generating unit 1201, configured to generate a prediction residue according to the input video data;
a transformation unit 1202, configured to select a set of best transformation matrices among multiple candidate transformation matrices according to optimization criteria to perform transform coding on the prediction residue and obtain a transformation result; and
a stream generating unit 1203, configured to encode the selected transformation matrix index information according to the transformation result and an intraframe prediction mode to generate an encoded stream.
In this embodiment, the selected set of best transformation matrices can be a non-separable transformation matrix or it can be a pair of transformation matrices, which includes a column transformation matrix and a row transformation matrix. The optimization criteria include: rate-distortion criteria, the sum of absolute differences (SAD), the number of coded bits, or distortion.
In this embodiment, the transformation unit 1202 is specifically configured to: traverse all combinations of a column transformation matrix and a row transformation matrix among the multiple candidate transformation matrices, select the combination with the minimum optimization-criteria cost after transform coding of the residue as the best transformation matrices, and obtain a transformation result.
In this embodiment, the encoding, by the stream generating unit 1203, of the selected transformation matrix index information according to the transformation result and the intraframe prediction mode to generate the encoded stream includes: selecting, according to the selected intraframe prediction mode, a method for encoding the transformation matrix index information, and recording the transformation matrix index information in the encoded data.
In addition, as shown in figure 13, the video data encoder additionally includes:
a coefficient scanning unit 1301, configured to select a scan order for a set of coefficients according to the transformation matrix index information to scan the transformed coefficients;
a judgment unit 1302, configured to determine the mode with the minimum optimization-criteria cost as the intraframe prediction mode after the prediction residue is encoded in various encoding modes, and obtain an encoding result; and
an index encoding unit 1303, configured to select, according to the selected intraframe prediction mode, a method for encoding the transformation matrix index information, to record the transformation matrix index information in the encoded data.
If the set of best transformation matrices is a pair of transformation matrices, selecting, according to the selected intraframe prediction mode, a method for encoding the transformation matrix index information, to record the transformation matrix index information in the encoded data, includes: encoding the index information of the pair of transformation matrices jointly, or encoding the index information of the pair of transformation matrices separately, and selecting a method for encoding the transformation matrix index information according to the selected intraframe prediction mode to record the transformation matrix index information in the encoded data.
Joint encoding indicates that the column transformation matrix and the row transformation matrix appear in pairs, and each row transformation matrix corresponds to a column transformation matrix; separate encoding indicates that a column transformation matrix does not necessarily correspond to a row transformation matrix. For example, a row transformation matrix can correspond to an arbitrary column transformation matrix, which can save storage space for transformation matrices.
The video encoder provided in this embodiment can select a set of best transformation matrices among multiple candidate transformation matrices according to the optimization criteria to perform transform coding on the prediction residue and obtain a transformation result. Through such a coding mode, the most efficient transformation matrix corresponding to the characteristics of each residual block is selected for the transformation, which improves the coding efficiency.
The video encoder provided in this embodiment can also select a set of best transformation matrices among multiple candidate transformation matrices according to an intraframe prediction mode and rate-distortion criteria to perform transform coding on the prediction residue and obtain a transformation result. Through such a coding mode, the most efficient transformation matrix corresponding to the characteristics of each residual block is selected for the transformation, which improves the coding efficiency.
As shown in figure 14, a video decoder provided in an embodiment of the present invention includes:
a resolution unit 1401, configured to resolve a video stream to obtain a transformation result, and obtain transformation matrix index information according to an intraframe prediction mode;
a determination unit 1402, configured to determine a transformation matrix among multiple candidate transformation matrices according to the transformation matrix index information; and
a reconstruction unit 1403, configured to use the determined transformation matrix to perform inverse transformation on the transformation result to obtain residual data, and to reconstruct the video data according to the residual data.
In this embodiment, the determined transformation matrix is a set of transformation matrices, and the set of transformation matrices can be a non-separable transformation matrix or it can be a pair of transformation matrices, which includes a column transformation matrix and a row transformation matrix.
In this embodiment, the result obtained after the resolution includes the transformation result. That is, the transformation result used in this embodiment is the result of the calculation. The transformation result can include the transformation coefficient matrix obtained after transformation.
The obtaining, by the resolution unit 1401, of the transformation matrix index information according to the intraframe prediction mode includes: selecting a method for decoding the transformation matrix index information according to the intraframe prediction mode, and obtaining the transformation matrix index information through decoding.
In addition, as shown in figure 15, the video decoder includes:
an inverse coefficient scanning unit 1501, configured to select a scan order for a set of coefficients according to the transformation matrix index information to perform inverse coefficient scanning on the transformed coefficients.
The video decoder provided in this embodiment can resolve the encoded video stream to obtain a transformation result, and obtain the transformation matrix index information through resolution according to the intraframe prediction mode; determine a transformation matrix among the multiple candidate transformation matrices according to the transformation matrix index information, use the transformation matrix to perform the inverse transformation on the transformation result to obtain residual data, and reconstruct the video data according to the residual data. In this way, decoding is performed without increasing complexity. Because the encoding is based on the method provided in the previous embodiment, the best transformation matrix can be selected with respect to the residual characteristics, and the entropy coding efficiency is improved. In addition, through the decoding method provided in the present embodiment, the efficiency of video encoding and decoding can be improved as a whole.
Persons skilled in the art should understand that all or part of the steps of the method according to the embodiments of the present invention can be implemented by relevant hardware instructed by a program. The program can be stored on a computer-readable storage medium. When the program is executed, it performs the steps of the method specified in the previous embodiments of the present invention. The storage medium can be any medium capable of storing program code, such as ROM, RAM, a magnetic disk, or an optical disk.
The above descriptions are merely preferred embodiments of the present invention, but are not intended to limit the scope of protection of the present invention. Any modifications, variations or substitutions that can be easily obtained by those skilled in the art without departing from the idea of the present invention must fall within the scope of protection of the present invention. Therefore, the scope of protection of the present invention is subject to the appended claims.
Claims (11)
1. Video data encoder, CHARACTERIZED by the fact that it comprises:
a residue generating unit, configured to generate a prediction residue according to input video data;
a transformation unit, configured to traverse all combinations of a column transformation matrix and a row transformation matrix among multiple candidate transformation matrices according to an intraframe prediction mode, select the transformation combination with the minimum rate-distortion cost after matrix multiplication as the best transformation matrices, and obtain a transformation result; and
a stream generating unit, configured to generate an encoded stream according to the transformation result and the selected transformation matrix index information.
2. Video data encoder, according to claim 1, CHARACTERIZED by the fact that it additionally comprises:
a coefficient scanning unit, specifically configured to select a scan order for a set of coefficients according to the intraframe prediction mode and the transformation matrix index information to scan a transformed coefficient.
3. Video data encoder, according to claim 1, CHARACTERIZED by the fact that it additionally comprises:
a judgment unit, configured to determine the mode with the minimum rate-distortion cost as the intraframe prediction mode after the prediction residue is encoded in various encoding modes, and obtain an encoding result.
4. Video data encoder, according to claim 1, CHARACTERIZED by the fact that it additionally comprises:
an index encoding unit, configured to record the transformation coefficient matrix index information in encoded data.
Petition 870190017594, of 02/21/2019, p. 8/10
5. Method for encoding video data, CHARACTERIZED by the fact that it comprises:
generating a prediction residue according to input video data;
traversing all combinations of a column transformation matrix and a row transformation matrix among multiple candidate transformation matrices according to an intraframe prediction mode, selecting the transformation combination with the minimum optimization-criteria cost after transform coding of the residue as the best transformation matrices, and obtaining a transformation result; and
generating an encoded stream according to the transformation result and the selected transformation matrix index information.
6. Method for encoding video data, according to claim 5, CHARACTERIZED by the fact that:
the best transformation matrices are a pair of transformation matrices that comprises a column transformation matrix and a row transformation matrix.
7. Method for encoding video data, according to claim 5, CHARACTERIZED by the fact that:
the optimization criteria include: rate-distortion criteria, the sum of absolute differences (SAD), the number of coded bits, or distortion.
8. Method for encoding video data, according to claim 5, CHARACTERIZED by the fact that it additionally comprises:
selecting a scan order for a set of coefficients according to the intraframe prediction mode and the transformation matrix index information to scan a transformed coefficient.
9. Method for encoding video data, according to claim 5, CHARACTERIZED by the fact that it additionally comprises:
encoding the prediction residue in various encoding modes, selecting the mode with the minimum optimization-criteria cost as the intraframe prediction mode, and obtaining an encoding result.
10. Method for encoding video data, according to claim 5 or 6, CHARACTERIZED by the fact that the generation of the encoded stream according to the transformation result and the selected transformation matrix index information comprises:
recording the transformation matrix index information in encoded data.
11. Method for encoding video data, according to claim 10, CHARACTERIZED by the fact that, if the best transformation matrices are a pair of transformation matrices, the recording of the transformation matrix index information in the encoded data comprises:
encoding the index information of the pair of transformation matrices jointly, or encoding the index information of the pair of transformation matrices separately; and
recording a result of encoding the index information in the encoded data.
Family patents:
Publication number | Publication date
AU2010310286B2|2014-05-01|
EP2493197A1|2012-08-29|
US9723313B2|2017-08-01|
BR112012011325A2|2016-04-19|
CN102045560A|2011-05-04|
CN102045560B|2013-08-07|
KR101481642B1|2015-01-22|
US20120201303A1|2012-08-09|
WO2011047579A1|2011-04-28|
AU2010310286A1|2012-06-14|
EP2493197A4|2014-07-16|
KR20120060914A|2012-06-12|
Legal status:
2018-03-27| B15K| Others concerning applications: alteration of classification|Ipc: H04N 19/136 (2014.01), H04N 19/122 (2014.01), H04N |
2019-02-05| B06A| Notification to applicant to reply to the report for non-patentability or inadequacy of the application [chapter 6.1 patent gazette]|
2019-04-09| B09A| Decision: intention to grant [chapter 9.1 patent gazette]|
2019-04-30| B16A| Patent or certificate of addition of invention granted|Free format text: TERM OF VALIDITY: 20 (TWENTY) YEARS COUNTED FROM 30/08/2010, SUBJECT TO THE LEGAL CONDITIONS. |
Priority:
Application number | Filing date | Patent title
CN200910209013.9|2009-10-23|
CN200910209013|2009-10-23|
CN201010147581.3|2010-04-09|
CN201010147581|2010-04-09|
CN2010102137918A|CN102045560B|2009-10-23|2010-06-17|Video encoding and decoding method and video encoding and decoding equipment|
CN201010213791.8|2010-06-17|
PCT/CN2010/076464|WO2011047579A1|2009-10-23|2010-08-30|Method and device for encoding and decoding video|